This report shows the results at the protein level of the Premature vs Normal Aging (2) proteomics project.
Disclaimer: All text and figures may be directly reproduced for publication. However, further interpretation and explanation of results is required during manuscript preparation.
Disclaimer: Sample preparation, mass spectrometry, and protein identification and quantitation methods are not covered in this text.
This project was a case-control study which involved the analysis of 8 Mus musculus samples distributed in 4 groups (N = 2 per group). Young WT was the control group and Old WT, Ht pro, Hm pro were the case groups.
| label | replicate | group | color | control | sample |
|---|---|---|---|---|---|
| 113 | 1 | Young WT | black | 1 | Young WT 1 |
| 114 | 2 | Young WT | black | 1 | Young WT 2 |
| 115 | 1 | Old WT | grey60 | 0 | Old WT 1 |
| 116 | 2 | Old WT | grey60 | 0 | Old WT 2 |
| 117 | 1 | Ht pro | chocolate | 0 | Ht pro 1 |
| 118 | 2 | Ht pro | chocolate | 0 | Ht pro 2 |
| 119 | 1 | Hm pro | red | 0 | Hm pro 1 |
| 121 | 2 | Hm pro | red | 0 | Hm pro 2 |
Comparative analysis of protein abundance changes was conducted with the SanXoT software package (Trevisan-Herraz et al., 2019), which was designed for the statistical analysis of high-throughput, quantitative proteomics experiments and is based on the Weighted Scan-Peptide-Protein (WSPP) statistical model (Navarro et al., 2014). As input, WSPP uses a list of quantifications in the form of log2-ratios (each condition versus the mean of all samples) with their statistical weights. From these, WSPP generates the standardized forms of the original variables by computing the quantitative values expressed in units of standard deviation around the means (z-scores). This normalization was verified using quantile-quantile (qq) plots. Replicate z-scores were averaged per group (mean z-scores) to assess between-group variability. The difference between case mean z-score and control mean z-score (∆ mean z-score) was used for comparisons with the control group. Proteins quantified with only 1 unique peptide were excluded from further analysis.
Within-group and between-group variability was assessed in the samples by fitting unsupervised Machine Learning algorithms (hierarchical clustering and principal component analysis: PCA) and by pairwise correlations between samples (Emerson et al., 2012). Statistical differences between groups were estimated using linear models from the R package “limma” (Ritchie et al., 2015). R version 4.1.0 (2021-05-18) was used. A significance threshold was established by comparing z-scores and p values in volcanic plots. The selected threshold was the absolute ∆ mean z-score that maximized true positive detections in relation to false positives and false negatives (measured by F1-score), given \(\alpha\) = 0.01, 0.05. As in this study there were multiple case groups, the mean F1-score between conditions was calculated. Known artifact proteins such as keratin contaminants, trypsin, and several major serum proteins (albumin, globins, serpins and complement factors) were excluded from the datasets after the analysis.
Gene set enrichment analysis (GSEA) was performed using the Gene Ontology Biological Process (GO-BP) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, and the R package “clusterProfiler” (Wu et al., 2021). Allowed set sizes were between 10 and 500, p values were adjusted by the Benjamini-Hochberg procedure, and a seed was set to ensure reproducibility.
A total of 7102 proteins were detected and quantified at FDR < 1%. Of these proteins, 4478 were quantified with more than one unique peptide.
Correct sample standardization was verified by assessing normality in qq plots. Proteins quantified with only 1 unique peptide were excluded from further analysis.
Figure 2.1: Quantile-quantile (qq) plots for normality verification. The shaded area represents the 95% confidence intervals of the regression line (in red).
Note: The normality condition is not met when the points (excepting extreme values) significantly diverge from the regression line. If this is the case, protein quantitation must be reviewed. This step is merely a checkpoint and may not be included in the publication.
Within-group and between-group variability was assessed by hierarchical clustering and principal component analysis (PCA). The cumulative proportion of variance explained by the first 3 components was 0.35309, 0.58631, 0.7336.
Figure 2.2: Sample hierarchical clustering.
Figure 2.3: Sample Principal Component Analysis (PCA). PC1, 2, 3: PCA component 1, 2, 3.
Note: Basal variability is expected between replicate samples (within-group variability), specially between biological replicates. Differences between conditions (between-group variability) is expected to be greater, so replicates should cluster together and each group form an independent cluster. If this is not the case, this could mean there is a sampling problem or that there are no significant proteomic differences between groups. However, other clustering algorithms or PCA components may be investigated.
Between-sample variability was further assessed by analyzing the correlation between all sample pair cobinations.
Figure 2.4: Pairwise correlation plots to assess between-sample variability. Corr: Pearson correlation coefficient.
Note: Correlations are expected to be higher between samples of the same group for the reasons stated before. The most relevant correlations may be reported in the article text.
A significance threshold was established by comparing z-scores from the WSPP model and p values from limma models in volcanic plots. Values of \(\alpha\) = 0.05 and z = \(\pm\) 1.6 were found to maximize true positives in relation to false positives and false negatives (F1 = 0.62). Therefore, protein abundance changes were considered significant when the absolute ∆ mean z-score (|∆ mean z|) was greater or equal to 1.6.
Figure 3.1: Volcano plots for statistical significance threshold determination. The x axis represents mean z-scores from the WSPP model for each condition, and the y axis, p values from the limma models. The horizontal line is the optimal \(\alpha\) threshold and the vertical lines are the z-score threshold that maximize F1-score. Red points: true positives; blue points: false positives; green points: false negatives; grey points: true negatives.
Note: Arbitrary thresholds may be selected instead of following this approach.
After filtering by proteins quantified with more than one unique peptide, known artifact proteins such as keratin contaminants, trypsin, and several major serum proteins (albumin, globins, serpins and complement factors) were excluded from the datasets in the following analyses (71 proteins).
Taking Young WT as a reference, the number of altered proteins was analyzed in each group.
Figure 3.2: Barplot representing the number of differentially expressed proteins in each condition.
Figure 3.3: Euler diagram showing common dysregulated proteins in each condition.
Figure 3.4: Euler diagram showing common upregulated proteins in each condition.
Figure 3.5: Euler diagram showing common downregulated proteins in each condition.
Note: The most relevant frequencies may be reported in the article text.
A curated list of differentially abundant proteins is provided as a heat map. For cleaner results, only proteins with at least one |∆ mean z| ≥ 1.6 and p value < 0.05 were included.
Figure 3.6: Relative abundance changes in each sample. Heat map of curated protein candidates with altered expression. z indicates protein abundance compared with Young WT. Selected proteins were quantified with > 1 unique peptide and had at least one |∆ mean z| ≥ 1.6 and p value < 0.05. Color intensity is maximal when |z| ≥ 3.
Note: The complete table of protein results is provided as a separate file, which contains:
GSEA was performed using both the GO and KEGG databases.
Figure 4.1: Enriched GO categories. Protein relative abundance change distributions in categories with significant enrichment (outlined in black). z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.
Note: When there are multiple group analyses, categories with non-significant enrichment (according to p.adjust) may be present for certain groups. These are outlined in light grey, as opposed to significantly enriched categories (outlined in black).
Figure 4.2: Enriched GO category concept network per group. z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.
Figure 4.3: Enriched GO category concept network per group. z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.
Figure 4.4: Enriched GO category concept network per group. z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.
Figure 4.5: Enriched KEGG categories. Protein relative abundance change distributions in categories with significant enrichment (outlined in black). z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.
Note: When there are multiple group analyses, categories with non-significant enrichment (according to p.adjust) may be present for certain groups. These are outlined in light grey, as opposed to significantly enriched categories (outlined in black).
Figure 4.6: Enriched KEGG category concept network per group. z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.
Figure 4.7: Enriched KEGG category concept network per group. z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.
Figure 4.8: Enriched KEGG category concept network per group. z indicates protein abundance compared with Young WT. Color intensity is maximal when |z| ≥ 3.